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Summary. While death rates due to diseases of the heart have experienced a sharp 
decline over the past 50 years, these diseases continue to be the leading cause of death in 
the United States, and the rate of decline varies by geographic location, race, and gender. 
We look to harness the power of hierarchical Bayesian methods to obtain a clearer picture 
of the declines from county-level, temporally varying heart disease death rates for men and 
women of different races in the US. Specihcally, we propose a nonseparable multivariate 
spatio-temporal Bayesian model which allows for group-specihc temporal correlations and 
temporally-evolving covariance structures in the multivariate spatio-temporal component of 
the model. After verifying the effectiveness of our model via simulation, we apply our model 
to a dataset of over 200,000 county-level heart disease death rates. In addition to yielding a 
superior £t than other common approaches for handling such data, the richness of our model 
provides insight into racial, gender, and geographic disparities underlying heart disease death 
rates in the US which are not permitted by more restrictive models. 
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1 Introduction 


Despite substantial reductions in death rates since the mid-1960s (e.g., Sempos et ah, 1988 


Ford and Capewell, 2011 Young et ah, 2010 Greenland et ah, 2007[), heart disease 


remains 


the leading cause of death in the United States (US, Murphy et ah, 2013). Work by Casper 


et ah (2015) has identified that while the nation as a whole has experienced substantial 


declines in heart disease mortality rates, there has been a substantial geographic shift over 
time, as mortality rates in the northeast have declined at a much faster rate than those 
in the Deep South. Previous work has also shown disparities in heart disease death rates 


between the sexes (e.g., Sempos et ah, 1988 [Kramer et ah 2015), between races (e.g., Kramer 


et ah, 2015), and geographically (e.g., Gillum et ah, 2012 Vaughan et ah, 2014, 2015), yet 


accounting for these various sources of disparities simultaneously has yet to be considered. 
Here, we look to build upon the existing heart disease literature to obtain a broader picture of 
these declining death rates using a hierarchical Bayesian statistical approach which accounts 
for correlation spatially, temporally, and between race/gender groups. 

There is an extensive literature on the subject of space-time modeling, particularly in 
the Bayesian context. A common approach for modeling discrete — or areal — spatial data 


is the use of the conditionally autoregressive (GAR) model proposed by Besag (1974) and 


later popularized in the disease mapping context by Besag et ah (1991). Early uses of the 


GAR model in the space-time setting include Waller et ah (1997) and Knorr-Held and Besag 


(1998) — both of which analyzed rates of lung cancer in Ohio counties — and Gelfand 


et ah (1998), whose interest pertained to the sale prices of homes. While these methods 


have used separable model structures for space and time, Knorr-Held (2000) discusses the 
use of nonseparable space-time models in a discrete space, discrete time setting with an 
application to lung cancer mortality rates in Ohio. In addition to space-time data models. 


Gelfand and Vounatsou (2003) and Garlin and Banerjee (2003) have developed methods for 


general multivariate spatial models. For a more complete coverage of the recent advances in 


spatial and space-time modeling, see Gressie and Wikle (2011) and Banerjee et ah (2014). 
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The concept of multivariate space-time (MST) models for discrete spatial data has also 


been explored previously. For instance, Congdon (2002) modeled suicide mortality rates in 
the boroughs of London using spatially varying regression coefficients and a nonparametric 


specihcation of the random effects, and Daniels et ah (2006) developed a conditionally spec- 
ihed model for the analysis of particulate matter and ozone data collected from monitoring 


sites in Los Angeles, CA. More recently, Bradley et ah (2014) proposed an alternative in 


which the authors use a shared component model (e.g., Knorr-Held and Best, 2001 Tzala 


and Best, 2008) with a reduced-rank spatial domain, extending the approach of Hughes and 


Haran (2013) to the MST setting. While a shared component model can offer substantial 


computational benehts by effectively reducing the complexity of a MST model to that of a 
reduced-rank space-time model, this assumption may not always be appropriate (e.g., when 
the available covariate information is insufficient to capture the differences in the geographic 
patterns) nor necessary (e.g., when number of groups in the multivariate structure is small). 

Due to the recent geographic and temporal evolutions in heart disease death rates, the 
methodological goal of this paper is to dehne a nonseparable multivariate space-time mod¬ 
eling framework to analyze the heart disease mortality rate data described in Section We 
propose our model in Section and demonstrate its ability to accurately estimate model 
parameters via simulation study in Section We then analyze our heart disease mortality 
data in Section where we observe temporally-evolving variance parameters inconsistent 
with the previously used separable model. Finally, we summarize our hndings and offer some 
concluding remarks in Section 


2 Data Description 

The study population for this analysis includes US residents, ages 35 and older, who were 
identihed on a death certihcate as either black or white — we restrict our analysis to these 
Ng = A groups because these are the only racial groups for whom data are available for the 
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entire duration of our study period. Annual counts of heart disease deaths per county per 
race/gender group were obtained from the National Vital Statistics System (NVSS) of the 
National Center for Health Statistics (NCHS). Due to differences in the manner in which 
death records were processed by NCHS, we restrict the analysis to data from 1973-2010 to 
ensure valid comparisons across time. Deaths from heart disease were dehned as those for 
which the underlying cause of death was “diseases of the heart” according to the 8th, 9th, 
and 10th revisions of the International Classihcation of Diseases (ICD)[^ Based on the works 


of Klebba and Scott (1980) and Anderson et ah (2001), we assume that this dehnition is 


consistent over the 38 year study period. Annual projected population counts were obtained 


from the NCHS (2013), and the numbers of heart disease deaths were age-standardized to 


the 2000 US Standard Population using 10 year age groups. 

The geographic unit used in this analysis was the county (or county equivalent). Given 
changes in county dehnitions during the study period (e.g., the creation of new counties), a 
single set of Ng = 3,099 counties from the contiguous lower 48 states was used for the entire 
study period. In an attempt to stabilize the data, county-level age-standardized counts and 
populations were aggregated into Nt = 19 two year intervals (i.e., 1973-74, 1975-76, etc.). 

3 Methods 

3.1 Review of methods for disease mapping 


One of the seminal papers in the held of disease mapping was the work of Besag et ah (1991). 


Letting V and rij denote the incidence of disease and the population at risk in county i, the 
authors proposed a model of the form 


Yi ~ Pois {rii exp [xi/3 + Zi + (pi]) ,i = 1, 


..,Ng 


( 1 ) 


HCD-8: 390-398, 402, 404, 410-429; ICD-9: 390-398, 402, 404-429; ICD-10: I00-I09, Ill, 113, I20-I51 
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where Xj denotes a p-vector of covariates with corresponding regression coefficients, f3, Zi 
is a spatial random effect, and (pi ^ N (0, r^) is an exchangeable random effect. In their 


work, Besag et ah (1991) modeled Z = (Zi,..., Zjyj' as arising from an intrinsic conditional 
autoregressive (CAR) model, which has the conditional distribution 


Ns 


Ns 


Ns 


Zi I Z(i),a^ ~ N ^WijZj/^Wij^a'^/ ^ 


w, 




( 2 ) 


0=1 


1=1 


1=1 


where Z(j) denotes the vector Z with the 1th element removed and W is an adjacency matrix 
with elements Wij = 1 if 1 and j are neighbors (denoted i ~ j) and 0 otherwise. Later work 


by Knorr-Held (2002) and Hodges et ah (2003) has shown that the joint distribution of Z 


in (|^ is of the form 


TT (Z I OC (CT^) 


2.-{Ns-1)/2 


exp 


2ct2 


Z\D-W)Z 


( 3 ) 


where D is an Ng x Ng diagonal matrix with elements 

Extending ([^-(|^ to a setting consisting of multiple spatial surfaces is straightforward. 
For instance, suppose we wish to map Ng diseases over an area consisting of Ng counties. 
Letting Yik denote the incidence of disease k in county 1, we may assume 


Yik ~ Pois {riik exp [xifc/3fc + Zik + (pik]) , 1 = 1,..., k = l,...,Ng. 


( 4 ) 


To model Z = (Zj.,..., Zjy^y where Zj. = ..., Zi^^y, we may follow the example of 

Gelfand and Vounatsou ( 2003D and let Z ~ MCAR(1, which yields the following: 


7r(Z|Sz) OC 


exp 


■-z'{(Z)-W')®sy}z 


and Zj. I Z(i).,Ez ~ iV ( 1-Ez ) , 


where denotes the Ng x Ng covariance structure for our Ng diseases and 0 denotes the 
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Kronecker product. Extensions of Q to the (multivariate) space-time setting follow similarly 


e.g., Zhu et ah, 2013), with the necessary specihcations of the covariance matrix 


While Poisson models like (|^ are common, they can also pose computational challenges. 
For instance, the full conditional of Zj, given by 


No 


TT 


(Zj I Y, Z(j)., /3, cf), T^z) OC Pois {Yik \ (3^, Zik, 4>ik) x tt (Zj | Zp)., 


( 5 ) 


k=l 


is not a known distribution. That is, if we use a Markov chain Monte Carlo (MCMC) 
algorithm to estimate the posterior distribution of our model parameters, this model may 
require the use of Metropolis steps within our Gibbs sampler. When the number of groups 
is large — or in the space-time setting when W is large — updating Z can be cumbersome. 
Besag et ah (1995) suggests a reparameterization of (|^ which would involve integrating 0*^. 
out of the model, yielding a Gaussian full conditional for Zj, though this model still consists 
of over twice as many parameters as data points. 

One alternative to modeling the counts using a Poisson likelihood is to model the rates 
as being log-normally distributed. For instance, suppose Yikt and Uikt denote the number of 
heart disease-related deaths and the population at risk for the fcth population in county i at 
time t, respectively. We could then model 9ikt = log (Yikt/riikt) using a Gaussian distribution. 
This may be problematic, however, as our data consist of a large number of counties experi¬ 
encing zero deaths related to heart disease for a given population in a given year. As such, 
this may require us to treat Yikt = 0 as data below the limit of detection by substituting 
Yikt = Yikt < 1 or by multiply imputing values for Yikt (e.g., see Fridley and Dixonj |2007[ ). 

In order to avoid the computational burden associated with the Poisson model in ([^ and 
the ill-handling of zeros in the log-normal model, we opt to model the rates themselves as 
Gaussian. That is, we let Yikt denote the age-standardized death rate (per 100,000) in county 
i E {1,..., Ns} during time interval f = {1,..., Nt} for race/gender group /c G {1,..., Ng} 
and we dehne Yj.j to be the vector collecting the Ng observations from time t in the ith 
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county, Yj.. = ..., to be the vector collecting the (NgNt) observations from 

the ith county, and Y = (Y'^..,..., Y^^..)' to be the NgNgNt-vector which stacks all of the 
age-standardized death rates. To model the death rates, we assume 


Yik, ~ N + Z,u, rfia) , i = l,---,N„ k = l,...,N„, i = 1,..., Af, 


( 6 ) 


where is the p x 1 vector of covariates for the ith county at time t with a corresponding 
p X 1 vector of regression coefficients, /3^, Zi^t is a random effect which accounts for the 
spatio-temporal dependence between and within the four race/gender groups, = r^/uikt, 
and riikt denotes the population of group k in county i at time t divided by 100,000. A recent 


example of a model of this form is Quick et al. (2013), where a Gaussian likelihood was used 


to model changes in county-level asthma hospitalization rates in California. We provide a 
defense of the Gaussian assumption for these data in Figure B.l of the Web Appendix. 


3.2 Choices for Hz 


Before we present our proposed MSTCAR model for in Section |3.2.3[ we begin by de¬ 
scribing other natural choices: independence models and a separable model. Not only do 
these models have computational benefits, but they are also special cases of the MSTCAR. 


3.2.1 Independence models 

Based on the multivariate spatial models described in Section |3.1| , one could opt to fit a 
collection of Ng independent space-time models (denoted STCAR) of the form 


Zjfc. I cr^, Pk ^ N i — Zjfc., —R (pk) j , k — 1,..., N, 


(7) 




where R {pk) denotes an Nf x temporal correlation matrix with parameter pk and is 
the variance parameter corresponding to race/gender group k. In addition to accounting for 
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spatiotemporal correlation, a model of the form (|^ with this structure for Z has the added 
computational beneht of being able to be £t in parallel as Ng separate models. This conve¬ 
nience, however, comes at the cost of failing to account for the correlation between groups. 
As we believe there to be a high degree of correlation between the heart disease mortality 
rates of our various race/gender groups, this drawback is particularly disappointing. 

We could also choose to £t Nt independent multivariate spatial models of the form 


Zj.t I Z(j).i, Gt ^ N 





where Gt denotes a temporally-varying Ng x Ng multivariate covariance structure for our 
race/gender groups. While this model can also have substantial computational benehts, the 
assumption of temporal independence is especially damning. 


3.2.2 Separable model 

Driven by the desire to account for both temporal and between-group correlation in our 
spatial model, a separable model of the form 

( 8 ) 


Z(i)..,G, p ~ iV — 




1 

rrii 


where we let R (p) denote an Nt x Nt temporal correlation matrix and G denote the Ng x Ng 
between-group covariance structure, may be attractive. The appeal of a separable model 
where = R{p) ® G is immediately clear: instead of accounting for multivariate temporal 
correlation using an unstructured NgNt x NgNt matrix, we can separate our problem 
into matrices of rank Ng and Nt, reducing the computational complexity of inverting 
substantially. While the criticism of separable models in the spatiotemporal literature is 
primarily directed toward their use in the continuous space, continuous time setting where 


prediction at unobserved locations is of interest (e.g., see Stein, 2005), the lack of a temporally 
evolving Gt or group-specihc pk may be undesirable. 
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3.2.3 The MSTCAR model 


To construct our random effects, Z, we will begin by defining Vt.t~iV(0, Gt) to be a collection 
of independent dimensional random variables with covariance Gt for l = 1,..., {Ng — 1) 
and t = 1, ..., Nt. Note the deliberate use of the subscript l instead of the subscript i] this 
is to reinforce that does not correspond to a particular county. From this, we dehne 
v,fc. = {v,ki, • • •, v,kNt)' and construct 


VLk- = Rk^ik- ~ ( 0 , RkGl.R't, 


i = 1 , 


,(A,-1), A; = 1,...,A, 


gi 


( 9 ) 


where we dehne Rk to be the Cholesky decomposition of Rk such that RkR'k = Rk, where 
Rk = R {pk) is a temporal correlation matrix based on an autoregressive order 1 — or AR(1) 
— structure with correlation parameter pk and G^. ^ is the Nt x Nt diagonal matrix with 
elements {Gt}k ^ for t = 1,..., Nt. Equivalently, we can dehne r/^.. ~ N (0, where 


Y^ri — 


R*i,i 0 0 


1 

o 

o 


p* p* 

-^1,1 ■ ■ ■ ^Nt,l 

0 


0 0 


0 

p* p* 

■ ■ ■ ^Nt,Nt 


o 

o 


0 0 R*Nt,Nt 


( 10 ) 


where denotes the Ng x Ng diagonal matrix with elements for k = 1,..., Ng. 


Finally, we let and dehne Z in the form of an MCAR(l,S,j) of Gelfand and 


Vounatsou (2003) with a conditional and (improper) joint distribution of 


Zj.. Z(j).., Gi, ..., G^t, P ~ iV — y, — 

\ 

TT (Z I Gi,..., Gat^, p ) oc exp 


( — V Zj, —Eg ) , i = l,...,Ng 
\ mi mi I 

\ / 


-tz'{(D-»') 0 E-‘}Z 


( 11 ) 

(12) 


respectively. We denote the expression in (12) as Z ~ MSTCAR (Gi,..., G^^, p). 















3.3 Hierarchical model 


We complete the hierarchial model by specifying prior distributions for the remaining pa¬ 
rameters. As is common in Bayesian modeling, we place a flat, noninformative prior on /3fc, 
and, following Gelman ( 2006| , assume an improper uniform prior over the positive real num¬ 
bers for Tfc. For each of the spatio-temporal covariance matrices, Gt, we assume an inverse 
Wishart distribution with positive dehnite scale matrix G and u > Ng — 1 degrees of free¬ 
dom, and we use Beta priors for each of the p^s. Finally, as many rural counties (particularly 
in the north-central states) have no data from the black populations, we decompose Y as 
Yc = (Yq, Y'^y, where Yq denotes the vector of counties with observed populations and Y„ 
denotes the vector of counties with unobserved populations. Putting these pieces together, 
the full hierarchical model is as follows: 


TT 


{(3,Z,Gi,...,Gt,p,TyYu\Yo) (xN{Y\X(3 + Z,Ey) x MSTCAR (Z | Gi, ..., Gat,, p) 

JJ [Beta {ap, bp) x tt (rl)] 

k=l 
Nt 

InvWish {Gt | G, i/), 


X 


(13) 


t=i 


where the notation 7r(x) denotes the marginal distribution for a random variable x and 
7r{x I y) denotes the conditional distribution of x given y. Here, Sy is a diagonal matrix 
with elements X is the {NgNgNt x p) matrix of covariates, and vr (r|) is the density 
for r| which corresponds to a flat prior for r^,. In cases where it may be difficult to learn 
about each Gt or each pk, we may consider putting additional structure on the priors for these 


parameters. Note that in (13), Y„ is treated as an unknown model parameter, and thus each 


Yikt £ Y^l is sampled from ([^ during each iteration of the MCMC algorithm. Furthermore, 
we assign a small value for each riikt in the set {riikt '■ Ytkt G Y„}. A detailed derivation of 
the MCMC sampler used for this analysis, as well as a description of the benehts of using 
an AR(1) model to account for temporal correlation, can be found in Web Appendix A. 
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4 Simulation Study 


To evaluate the ability of our model to accurately estimate all of our model parameters, 
we devised two simulation studies, each comprised of L = 100 sets of data generated using 
our MSTCAR model with Nt = 10 timepoints, Ng = 3 groups, and the Ng = 58 counties 
of California as our spatial domain. This spatial domain offered a compromise between 
creating a computationally feasible simulation study (compared to using all 3,099 county 
equivalents) while representing a state with a moderate number of counties and variation in 
population density and geographic spread. The hrst simulation study assumes that Uikt = n 
for all combinations of {i, k,t), allowing us to focus on parameter estimation irrespective of 
the amount of information each county can provide. We will then relax this assumption by 
generating data using actual populations of California counties. 

In each simulation study, performance was primarily assessed via coverage (i.e., the per¬ 
cent of 95% credible intervals (95% Cl) which cover the true parameter values) where values 
near 95% are desired. Furthermore, we will compare results from the MSTCAR model pro¬ 
posed here to those obtained using a separable model. While the separable model will be 
incapable of providing accurate estimates for the many additional parameters which com¬ 
prise Tig, the focus here will be on model £t. Specihcally, we will compare the coverage of 


Z and the deviance information criterion (DIC) of Spiegelhalter et ah (2002), where lower 


values indicate a better compromise of model £t and model complexity. 


4.1 Equal population sizes 

The Rh dataset is created by generating N(z'§,,rl) where t| = 1 for fc = 1...., ]V, 


and is drawn from the MSTCAR model in (12). To do this, we hrst let p = (0.8, 0.85, 0.90)' 


and generated samples of Gt from an inverse Wishart distribution with 2 * -|- 1 degrees of 

freedom and scale matrix 20 * * JAr^, where is the identity matrix of size Ng. Using 

these parameters to construct Tg (from which all L datasets are based), we generated our 
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latent variables r][^} ~ N (0, S^). From these, we used the methods described in 


Rue and 


Held (2005) to generate our specihcally, we found the eigenvalues and eigenvectors of 


the matrix D — W (based on the adjacencies of counties in California) and used the linear 
dependence of the eigenvectors to generate our spatial structure. Each simulated dataset is 


then analyzed using the hierarchical model in (13) using MCMC. Using the priors described 


in the previous section, we initialized all of our parameters (including Z) at their true values, 
resulting in chains which were quick to converge and allowing us to assess the performance 
of our model using just 1,500 iterations of our MCMC algorithm, the last 500 of which were 
used as the basis for our results. In order to better visualize these results, we also display 
results from an arbitrarily selected dataset. 

Overall, our model performed quite well. Collectively, the Zi^t were well estimated, 
as demonstrated in Figure [T| with our model obtaining 94.4% coverage and offering an 
improvement in DIC in 82 of the 100 datasets. This accuracy is permitted due in part to 
the flexibility of our model to allow for temporally evolving Gt- As shown in Figure the 
randomly generated Gt exhibited some irregular behavior. While the MSTCAR model was 
able to estimate these Gt quite well — with 95.4% coverage for the diagonal elements (i.e., 
the variances) and 95.3% coverage for the off-diagonal elements (i.e., the covariances) — 
the separable model fails to accommodate such a nuanced multivariate structure. We also 
achieved accurate estimates of the error variances, r|, for which we obtained an average of 
91.3% coverage. In contrast, coverage for pk was less than ideal (85%). 


4.2 Varying population sizes 

In our second simulation study, we generated data using the same design as described in 
Section 4.1[ but here we assigned riij^t to be the population of the ith county at time t for 
the following subpopulations: white men {k = 1), white women {k = 2), and black men and 
women {k = 3). While white men and women have Uikt > 200 in all counties for all time 
periods, there are many counties with small black population sizes. As such, we combine 
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Time 


Time 


Figure 1: Selected Zj^.. curves from one dataset of the first simulation study. Plots in the 
same row correspond to the same county, and plots in the same column correspond to the 
same group. Black lines denote posterior medians, red circles denote true values, and gray 
bands denote the 95% Cl. 


black men and women to limit the number of counties with no data. In cases where a county 
has no population during time t, however, we assume riikt = 1 and treat Yn^t as missing. 

Our model was again able to obtain accurate estimates for the Zikt and the various 
elements of the Gt while outperforming the separable model in all 100 datasets (based on 
DIG). Furthermore — aside from an expected increase in the width of the credible intervals 
— there does not appear to be any degradation in the estimation for these parameters as we 
shift from the well-populated groups to the third, less populated group. Unfortunately, our 
model again performs less well with respect to the temporal correlation parameters, pk- It is 
understandable, though, how the problem from our hrst example would be exacerbated here, 
as the amount of information provided by each group depends on the county populations. 


4.3 General findings 

In both simulation studies, the MSTCAR was able to obtain accurate estimates of both the 
Zikt and the Gt- While the nonseparable model offered improved DIG when compared to the 
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Figure 2: Estimated Gt-k,k' from one dataset of the first simulation study. Top row displays 
the diagonal elements, while the bottom row displays the off-diagonal elements. Black lines 
denote posterior medians, red circles denote true values, and gray bands denote the 95% Cl. 
For comparison purposes, the green lines denote the analogous values from the separable 
model. 


separable model, it is important to note that the differences were not substantial, with just 
over a 1% reduction on average. This suggests that the key beneht of the MSTCAR model 
(with respect to model £t) is that it provides more precise results (i.e., narrower credible 
intervals) than the separable model while still achieving the desired coverage. 

Based on these results, the pk parameters appear to be difhcult to identify. As such, if 
inference on the pk is desired, it may be necessary to run our MCMC algorithms for more 
iterations and consider thinning our samples to obtain samples which are less correlated 
over the course of the chain. Another option would be to consider respecifying our priors 
for the pk- In these simulation studies, we had assumed a Beta(9,1) prior for pk, but a more 
informative prior may be appropriate, particularly in the case of varying population sizes. For 
instance, we could assume a multi-level model of the form pk Beta {vppo, Vp{l - po)) for k = 
1,..., Ng, where po ~ Beta (oq, bo) and Vp is a parameter which controls the informativeness 
of the prior. In extreme cases, we may even consider forcing pk = po, which can be induced 
by letting Vp ^ oo. In addition to improving the convergence of our MCMC algorithm. 
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this may also lead to minor computational benefits while still yielding a model that is more 
flexible than the separable model in (|^. 


5 Analysis of Heart Disease Death Rates 


We htted the nonseparable hierarchical model in (13) to the heart disease mortality data 


described in Section using covariates consisting of only an intercept term for each com¬ 


bination of 2-year time-interval and race/gender (as required, per Besag and Kooperberg 


1995), forcing the random effects to account for a substantial amount of the spatio-temporal 


variability in the data. We place a Beta(9,1) prior on each of the pk to encourage higher 
temporal correlations in the model, and we use a vague inverse Wishart prior for each of 
the Gt- We ran the MCMC algorithm with a single chain for 6,000 iterations, diagnosing 
convergence via trace plots for many of the model parameters and discarding the hrst 1,000 
iterations as burn-in. Following that, we thinned our posterior samples by removing 9 out of 
10 samples — while this is not theoretically necessary, it reduced the burden of storing ex¬ 
cess samples for our over 200,000 random effects. Estimates provided are based on posterior 
medians, and 95% credible intervals (95% Cl) were obtained by taking the 2.5- and 97.5- 
percentiles from the thinned post-burn-in samples. To determine if the burden associated 
with htting this nonseparable model was necessary, we compared our model to the separable 
model in (|^ and the Ng independent STCAR models in ([^. 

Table displays the results of our model comparison. Here, it is clear that the inde¬ 
pendent STCAR models — while computationally convenient — are inadequate for these 
data, as both the separable and MSTCAR models offer improvements in DIC of over 94,000 
units. As seen in Section]^ the separable and MSTCAR models appear to perform similarly, 
with the MSTCAR model having a DIC only 5,828 units lower. Given the evidence in the 


literature that DIC tends to favor over-htted models (e.g., Robert and Titterington, 2002), 


it remains unclear if the flexibility of the MSTCAR model is required here; nevertheless, we 
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Model 

DIG 

Pd 

STCAR 

Separable 

MSTCAR 

2,423,049 

2,334,355 

2,328,527 

32,110 

24,185 

25,699 


Table 1: Model fit comparison between the independent STCAR models, a separable model, 
and the nonseparable MSTCAR model proposed here. Lower values of DIG indicate a better 
compromise of model £t and model complexity, where is a measure of model complexity. 

will henceforth focus our attention on results from the MSTCAR model. 

Figure [^displays the expected nationwide death rate trends for each group. These trend 
lines were computed by hrst computing the posterior distribution for the expected value for 
Yikt as Yikt = + ^ikt- We then estimated the nationwide death rate for group k at 

time t by constructing the posterior for 

/ '^ikt 

A number of important hndings can be found from this hgure. First and foremost, all four of 
our race/gender groups have experienced substantial declines, with death rates being more 
than cut in half. Secondly, men of both races experience significantly higher rate of heart 
disease-related death than women. That said, men and women of both races do not decline 
at the same rate; e.g., while white men began the study as the population with the highest 
risk, they were soon surpassed by black men, whose rates appear to be relatively stagnant 
for the period from 1975-76 to 1987-88. This trend is also visible for black women. 

To illustrate the changing geographic patterns. Figure [^displays heart disease death rates 
for white men for four time-intervals. Here, we notice an interesting trend, as several major 
cities (e.g., Denver, CO; Washington, DC; Atlanta, GA; Minneapolis, MN) are consistently 
leading the charge toward lower rates of heart disease related death for white men in their 
respective regions. On the other hand, there are collections of counties in which rates are 
lagging behind, most prominently along the southern Mississippi River and much of the Deep 
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Figure 3: Heart disease death rates over time for each of the race/gender groups compared 
to the total population. Gray bands denote the 95% credible intervals for the estimates. 


South. Similar patterns can be found for the remaining race/gender groups. 

We now turn our attention to the numerous variance parameters permitted by the use 
of the nonseparable model. While in Section we presented posterior distributions for the 
elements of Gt (Figure |^, these parameters are not necessarily of direct interest as they 
are the variance parameters for v^.*, and thus they are not directly interpretable on the 
scale of the data. Instead, we need to use our posterior samples of Gt and pk to construct 


from (10). These values coincide to the conditional covariance matrix of Zj.. (when 
scaled by the number of neighbors, rrii), and thus are interpretable on the scale of the data. 
Figure displays the diagonal elements of from the nonseparable model, as compared 
to the analogous estimates from the separable model. Here, we hnd — for all race/gender 
groups — that the variability of Zikt has decreased substantially from the beginning of 
the study period to the end. More importantly, however, we note that the separable model 
severely underestimates the variance at the beginning of the study and severely overestimates 
the variance at the end. As shown in Figure B.2 of the Web Appendix, this can lead to 
oversmoothing when the rates are the highest (the 1970s) and undersmoothing when the 
rates are lowest (the 2000s), neither of which is desirable. This may be due to the fact 
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that the rates themselves decline over time. Correlations between race/gender groups are all 
non-zero, with high correlations between genders of the same race and moderate correlations 
between races; these results can be found in Figure B.3 of the Web Appendix. 


6 Discussion 


In this paper, we have proposed a nonseparable framework for the purpose of modeling a 
dataset comprised of temporally-varying county-level heart disease death rates for multiple 
race/gender populations. We evaluated the validity of the proposed methodology — referred 
to as the MSTCAR model — via simulation and demonstrated that the model was capable 
of providing a good £t to the data and obtaining accurate estimates for the many variance 
parameters. Not only did the MSTCAR model outperform two more conventional models, 
but we show our model can help control the degree of smoothing in data which undergo a 
substantial temporal evolution during the study period. 

While the methods proposed here are much more sophisticated than more commonplace 
models like those discussed in Section |3.2[ there are a number of extensions which could be 
used to enhance the MSTCAR model. For manageable values of Ng, for instance, one could 
envision models with region-specihc parameters and pik- Implementing these models 
would likely require the use of a proper CAR model (e.g., the model proposed in Section]^ 


is constructed using only A^^ — 1 latent vectors), say by replacing {D — W) in (12) with 
{D — afctlF), where a^t G [0,1) ensures propriety and akt = 1 yields the improper CAR- 
based model used here. Furthermore, one may choose to use a multi-level modeling approach 
for specifying priors for many of these parameters, such as 


Git ~ InvWish {uiGt, Vi ), Gt ~ Wish (I/i/Gq, v) , and Go ~ InvWish (z/qG, z/q) 


to facilitate additional borrowing-of-strength. Computational burden and identihability con¬ 
cerns notwithstanding, such a model would be rather intuitive to specify and construct; i.e.. 
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one could let rj^,, ~ iV(0, where is constructed as in (10) with i subscripts. Based 


on the results of Quick et al. (2015) — where the authors extended a separable space-time 
model to allow for region-specihc variance parameters — there is evidence to believe that 
models of this sort may offer substantial improvements in £t. 

For cases where Ns is large, one may also consider using dimension reduction techniques 


such as those proposed by Hughes and Haran (2013) and extended by Bradley et ah (2014). 
Unfortunately, it’s unclear whether or not this would actually result in computational gains 


in our setting without making additional assumptions, as the approach of Hughes and Haran 


(2013) removes the conditional properties which make CAR models attractive. That is, when 
implementing the MSTCAR model proposed here, one need only invert and manipulate 
matrices of rank NgNt to sample the Z,.., albeit this requires looping through each of the Ng 


areal regions. An analogous approach based on Hughes and Haran (2013), however, would 
replace this Ng loop with a single A^*A'"gA"r dimensional update, where N* <C Ng is the rank 
of the reduced spatial domain. Were we to reduce the dimension of our spatial domain from 
Ng = 3,099 to Nl = 310 (a 90% reduction), this would still require manipulating matrices 
of ra nk N*NgNt = 23,560, which would not be feasible in our setting. While one could take 
advantage of the AR(1) structure to ease the burden, this would result in the manipulation 
of matrices of rank N*Ng, which may still be too large to implement in practice without 


resorting to the shared component model of Bradley et al. (2014). 

In the immediate future, we have two primary areas for next steps. Motivated by this and 
earlier work, we aim to investigate the observed geographic disparities in heart disease death 
rates by identifying potential factors which may be associated with the patterns observed 
here. In addition to further exploring the mechanics driving heart disease death rates, we plan 
to apply a similar modeling framework to data comprised of county-level stroke-ielated death 
rates. As stroke data are typically more erratic with much lower rates of incidence, these 
data will present additional challenges. In particular, the normal approximation used in this 
analysis will be less appropriate; as such, we aim to explore the possibility of implementing 
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this methodology in a log-linear modeling framework using a Poisson likelihood. 
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Figure 4: Estimated expected heart disease death rates (per 100,000) for white men for selected time-intervals. 
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Figure 5: Estimated diagonal elements of for the heart disease mortality data for our four 
race/gender groups. Black lines denote posterior medians from the nonseparable model while 
gray bands denote 95% CL For comparison purposes, the red line indicates the analogous 
value from the separable model which is constant over time. 
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